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Network analysis is entering fields where network structures are unknown, such as psychology and the 
educational sciences. A crucial step in the application of network models lies in the assessment of network 
structure. Current methods either have serious drawbacks or are only suitable for Gaussian data. In the 
present paper, we present a method for assessing network structures from binary data. Although models for 
binary data are infamous for their computational intractability, we present a computationally efficient 
model for estimating network structures. The approach, which is based on Ising models as used in physics, 
combines logistic regression with model selection based on a Goodness-of-Fit measure to identify relevant 
relationships between variables that define connections in a network. A validation study shows that this 
method succeeds in revealing the most relevant features of a network for realistic sample sizes. We apply our 
proposed method to estimate the network of depression and anxiety symptoms from symptom scores of 
1 108 subjects. Possible extensions of the model are discussed. 

Research on complex networks is growing and statistical possibilities to analyse network structures have 
been developed to great success in the past decade 15 . Networks are studied in many different scientific 
disciplines: from physics and mathematics to the social sciences and biology. Examples of topics that 
have recently been subjected to network approaches include intelligence, psychopathology, and attitudes 6 " 10 . 
Taking psychopathology as an example, nodes (elements) in a depression network may involve symptoms, 
whereas edges (connections) indicate to what extent symptoms influence each other. The structure of such a 
network, however, is unknown due to the absence of a sufficiently formalised theory of depression. 
Consequently, the network structure has to be extracted from information in data. The challenging question 
is how to extract it. 

Methods that are currently used to discover network structures in the field of psychology are based on 
correlations, partial correlations, and patterns of conditional independencies 7,11 " 13 . Although such techniques 
are useful to get a first impression of the data, they suffer from a number of drawbacks. Correlations and partial 
correlations, for example, require assumptions of linearity and normality, which are rarely satisfied in psychology, 
and necessarily false for binary data. Algorithms like the PC-algorithm 1415 , which can be used to search for causal 
structure, often assume that networks are directed and acyclic, which is unlikely in many psychological cases. 
Finally, in any of these methods, researchers rely on arbitrary cutoffs to determine whether a network connection 
is present or not. A common way to determine such cutoff- values is through null-hypothesis testing, which often 
depends on the arbitrary level of significance of a = .05. In the case of network analysis, however, one often has to 
execute a considerable number of significance tests. One can either ignore this, which will lead to a multiple 
testing problem, or deal with it through Bonferonni corrections, (local) false discovery rate, or other methods 16-18 , 
which will lead to a loss of power. 

For continuous data with multivariate Gaussian distributed observations, the inverse covariance matrix is a 
representation of an undirected network (also called a Markov Random Field 15,20 ). A zero entry in the inverse 
covariance matrix then corresponds to the presence of conditional independence between the relevant variables, 
given the other variables 21 . To find the simplest model that explains the data as adequately as possible according to 
the principle of parsimony, different strategies are investigated to find a sparse approximation of the inverse 
covariance matrix. Such a sparse approximation can be obtained by imposing an € 1 -penalty (lasso) on the 
estimation of the inverse covariance matrix 13,22 ' 23 . The lasso ensures shrinkage of partial correlations and puts 
others exactly to zero 24 . A different take involves estimating the neighborhood of each variable individually, as in 
standard regression with an € i -penalty 25 , instead of using the inverse covariance matrix. This is an approximation 
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to the € 1 -penalised inverse covariance matrix. This Gaussian 
approximation method is an interesting alternative: it is computa- 
tionally efficient and asymptotically consistent 25 . 

In psychology and educational sciences, variables are often not 
Gaussian but discrete. Although discrete Markov Random Fields 
are infamous for their computational intractability, we propose a 
binary equivalent of the Gaussian approximation method that 
involves regressions and is computationally efficient 26 . This method 
for binary data, which we describe in more detail in the Methods 
section, is based on the Ising model 1 ' 27 . In this model, variables can 
be in either of two states, and interactions are at most pairwise. The 
model contains two node-specific parameters: the interaction para- 
meter fijk, which represents the strength of the interaction between 
variable j and k, and the node parameter x p which represents the 
autonomous disposition of the variable to take the value one, regard- 
less of neighbouring variables. Put simply, the proposed procedure in 
our model estimates these parameters with logistic regressions: itera- 
tively, one variable is regressed on all others. However, to obtain 
sparsity, an € r penalty is imposed on the regression coefficients. 
The level of shrinkage depends on the penalty parameter of the lasso. 
The penalty parameter has to be selected carefully, otherwise the 
lasso will not lead to the true underlying network - the data generat- 
ing network 25 . The extended Bayesian Information Criterion 28 
(EBIC) has been shown to lead to the true network when sample size 
grows and results in a moderately good positive selection rate, but 
performs distinctly better than other measures in having a low false 
positive rate 29 . 

Using this approach, we have developed a coherent methodology 
that we call eLasso. The methodology is implemented in the freely 
available R package IsingFit (http://cran.r-project.org/web/packages/ 
IsingFit/IsingFit.pdf). Using simulated weighted networks, the pre- 
sent paper studies the performance of this procedure by investigating 
to what extent the methodology succeeds in estimating networks 
from binary data. We simulate data from different network archi- 
tectures (i.e., true networks; see Figures la and lb), and then use the 
resulting data as input for eLasso. The network architectures used in 
this study involve random, scale-free, and small word networks 3032 . 
In addition, we varied the size of the networks by including condi- 
tions with 10, 20, 30, and 100 nodes, and involve three levels of 
connectivity (low, medium, and high). Finally, we varied the sample 
size between 100, 500, 1000, and 2000 observations. After applying 
eLasso, we compare the estimated networks (Figure lc) to the true 
networks. We show that eLasso reliably estimates network structures, 
and demonstrate the utility of our method by applying it to psycho- 
pathology data. 

Results 

Validation study. The estimated networks show high concordance 
with the true networks used to generate the data (Figure 2). Average 
correlations between true and estimated coefficients are high in all 
conditions with 500 observations or more (M = .883, sd = .158, see 
Table 1). In the smallest sample size condition involving only 100 
observations, the estimated networks seems to deviate somewhat 
more from the true networks, but even in this case the most 
important connections are recovered and the average correlation 
between generating and estimated networks remains substantial 
(M = .556, sd = .155). Thus, the overall performance of eLasso is 
adequate. 

More detailed information about eLasso' s performance is given by 
sensitivity and specificity. Sensitivity expresses the proportion of true 
connections which are correctly estimated as present, and is also 
known as the true positive rate. Specificity corresponds to the pro- 
portion of absent connections which are correctly estimated as zero, 
and is also known as the true negative rate. It has been shown that 
sensitivity and specificity tend to 1 when sample sizes are large 
enough 29,33 ; the question is for which sample sizes we come close. 



Overall, specificity is very close to one across all conditions (M = 
.990, sd = .0 14) with somewhat lower specificity scores for the largest 
and most dense random networks (see Table 2). Overall, sensitivity is 
lower (M = .463, sd = .238) but becomes moderate for conditions 
involving more than 100 observations (M = .568, sd = .171). The 
reason that sensitivity is lower than specificity lies in the use of the 
penalty function (lasso); to manage the size of the computational 
problem, eLasso tends to suppress small but nonzero connections 
towards zero. Thus, lower sensitivity values mainly reflect the fact 
that very weak connections are set to zero; however, the important 
connections are almost aways correctly identified. In addition, the 
specificity results indicate that there are very few false positives in the 
estimated networks; thus, eLasso handles the multiple testing prob- 
lem very well. Figure 1 nicely illustrates these results: almost all 
estimated connections in Figure lc are also present in the generating 
network depicted in Figure lb (high specificity), but weaker connec- 
tions in the original network are underestimated (low sensitivity). 

The above pattern of results, involving adequate network recovery 
with high specificity and moderately high sensitivity, is represent- 
ative for almost all simulated conditions. The only exception to this 
rule results when the largest random and scale-free networks (100 
nodes) are coupled with the highest level of connectivity. In these 
cases, the estimated coefficients show poor correlations with the 
coefficients of the generating networks, even for conditions involving 
2000 observations (.222 and .681, respectively). For random net- 
works, the reason for this is that the number of connections increases 
as the level of connectivity increases. For scale-free networks, the 
number of connections does not increase with increasing level of 
connectivity, but it does result in a peculiar arrangement of network 
connections, in which one node comes to have disproportionately 
many connections. Because eLasso penalises variables for having 
more connections, larger sample sizes are needed to overcome this 
penalty for these types of networks. 

Although the lower level of sensitivity is partly inherent in the 
chosen method to handle the computational size of the problem 
and the solution to multiple testing through penalisation, it might 
be desirable in some cases to have a higher sensitivity at the expense 
of specificity. In eLasso, sensitivity can generally be increased in two 
ways. First, eLasso identifies the set of neighbours for each node by 
computing the EBIC 28 (extended BIC). EBIC penalises solutions that 
involve more variables and more neighbours. This means that if the 
number of variables is high, EBIC tends to favour solutions that 
assign fewer neighbours to any given node. In this procedure, a 
hyperparameter called y determines the strength of the extra penalty 
on the number of neighbours 29,33 . In our main simulation study, we 
used y = .25. When y = 0, no extra penalty is given for the number of 
neighbours, which results in a greater number of estimated connec- 
tions. Second, we applied the so-called AND-rule to determine the 
final edge set. The AND-rule requires both regression coefficients /J* 
and fS]y (from the €!-regularised logistic regression ofX ; onX^ and of 
Xk on Xj) to be nonzero. Alternatively, the OR-rule can be applied. 
The OR-rule requires only one of and to be nonzero, which 
also results in more estimated connections. 

By applying the OR-rule and y = 0, correlations between true and 
estimated coefficients are even higher in all conditions with 500 
observations and more (M = .895, sd = .156; Table 1). Sensitivity 
also improved across all conditions (M = .584, sd = .221; Table 2). 
With more than 100 observations, average sensitivity is higher (M = 
.682, sd = .153). Applying the OR-rule and setting y = 0 thus indeed 
increases the sensitivity of eLasso. As expected, this gain in sensitivity 
results in a loss of specificity; however, this loss is slight, as specificity 
remains high across all conditions (M = .956, sd = .039; Table 2). 

Finally, it should be noted that with sparse networks, specificity 
partly takes on high values due to the low base rate of connections, 
since it is based on the number of true negatives. Therefore, we also 
investigated another measure, the so-called Fl score, that is not based 
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Figure 1 | Examples of networks with 30 nodes in the simulation study, (a) Generated networks. From left to right: random network (probability of 
an extra connection is 0.1), scale-free network (power of preferential attachment is 1) and small world network (rewiring probability is 0.1). 
(b) Weighted versions of (a) that are used to generate data (true networks), (c) Estimated networks. 



on true negatives but on true positives, false positives and false nega- 
tives 34 ; as such, it is independent of the base rate. For most conditions, 
the trends in the results are comparable. However, for larger and/or 
more dense random networks, the proportion of estimated connec- 
tions that are not present in the true network is larger. More details 
about these results are provided in the online Supplementary 
Information. 

To conclude, eLasso proves to be an adequate method to estimate 
networks from binary data. The validation study indicates that, with 
sample sizes of 500, 1000, and 2000, the estimated network strongly 
resembles the true network (high correlations). Specificity is uni- 
formly high across conditions, which means there is a near absence 
of false positives among estimated network connections. Sensitivity 
is moderately high, and increases with sample size. For the most part, 



sensitivity is lowered because of weak connections that are incor- 
rectly set to zero; in these cases, however, eLasso still adequately picks 
up the most important connectivity structures. For larger networks 
with either higher connectivity or a higher level of preferential 
attachment, sensitivity becomes lower; in these cases, more observa- 
tions are needed. 

Application to real data. To demonstrate the utility of eLasso, we 
apply it to a large data set (N = 1108) containing measurements of 
depression of healthy controls and patients with a current or history 
of depressive disorder. We used 27 items of the Inventory of 
Depressive Symptomatology 35 , which was administered in the 
Netherlands Study of Depression and Anxiety 36 (NESDA). Using 
eLasso, we investigate how individual depression symptoms are 



SCIENTIFIC REPORTS | 4:5918 | DOI: 1 0. 1 038/srep0591 8 



3 



Random 



Scale free 



Small world 




S size = 100 Ssize = 500 S siZ e = 1000 S si ze = 2000 

Figure 2 | Mean correlations (vertical axes) of the upper triangles of the weighted adjacency matrices of true and estimated networks of 1 00 simulations 
with random, scale-free, and small world networks for sample sizes s sl2t , — 100, 500, 1000, and 2000, with number of nodes n nod<;s — 10, 20, 30, and 100. 

We used three levels of connectivity (random networks: probability of an extra connection P conn = . 1 , .2, and .3; scale-free networks: power of preferential 
attachment P attac h = 1, 2, and 3; small world networks: rewiring probability of P rew i rc = .1, .5, and 1). For the condition with 100 nodes, we used 
different levels of connectivity for random and scale-free networks in order to obtain more realistic networks (random networks: P conn = .05, .1, and .15; 
scale-free networks: P aaa ch = 1> 1-25, and 1.5). 



related, as this may reveal which symptoms are important in the 
depression network; in turn, this information may be used to 
identify targets for intervention in clinical practice. 

The eLasso network for these data is given in Figure 3. To analyse 
the depression network, we focus on the most prominent properties 
of nodes in a network: node strength, betweenness, and clustering 
coefficient (Figure 4). Node strength is a measure of the number of 
connections a node has, weighted by the eLasso coefficients 37 . 
Betweenness measures how often a node lies on the shortest path 
between every combination of two other nodes, indicating how 
important the node is in the flow of information through the net- 
work 38,39 . The local clustering coefficient is a measure of the degree to 
which nodes tend to cluster together. It is defined as how often a node 
forms a triangle with its direct neighbours, proportional to the num- 
ber of potential triangles the relevant node can form with its direct 
neighbours 38 . These measures are indicative of the potential spread- 
ing of activity through the network. As activated symptoms can 
activate other symptoms, a more densely connected network facil- 
itates symptom activation. Moreover, we inspect the community 
structure of the networks derived from the empirical data, to identify 
clusters of symptoms that are especially highly connected. 

Figure 3 reveals that most cognitive depressive symptoms (e.g., 
"feeling sad" (sad), "feeling irritable" (irr), "quality of mood" (qmo), 
"response of your mood to good or desired events" (rmo), "concen- 
tration problems" (con), and "self criticism and blame" (sel)) seem to 
be clustered together. These symptoms also seem to score moderate 
to high on at least two out of three centrality measures (Figure 4). For 
example, "rmo" has a moderate strength and a very high clustering 
coefficient, whereas it has a low betweenness. This indicates that 
activation in the network does not easily affect response of mood 



to positive events (low betweenness), but that, if the symptom is 
activated, the cluster will tend to stay infected because of the high 
interconnectivity (high clustering coefficient). Another interesting 
example is "energy level" (ene), which has a high node strength 
and betweenness, but a moderate clustering coefficient. Apparentiy, 
energy level has many and/or strong connections (high strength) and 
lies on many paths between symptoms (high betweenness), whereas it 
is not part of a strongly clustered group of symptoms (moderate 
clustering coefficient). This symptom is probably more important 
in passing information through the network, or between other clus- 
ters, and might, therefore, be an interesting target for intervention. 

As opposed to cognitive depressive symptoms, most anxiety and 
somatic symptoms (e.g., "panic/phobic symptoms" (pan), "aches 
and pains" (ach), "psychomotor agitation" (agi)) feature low scores 
on at least two centrality measures. Apparently, most anxiety and 
somatic symptoms either are less easily affected by other activated 
symptoms, do not tend to stay infected because of low interconnec- 
tivity (low clustering coefficient), or are less important for transfer- 
ring information through the network (low betweenness). This is to 
be expected, since participants with a current or history of anxiety 
disorder are excluded from our sample. The item "feeling anxious" 
(anx), however, seems to be an important exception; feeling anxious 
does have a high node strength, a relatively high betweenness, and a 
moderate clustering coefficient. Apparently, feeling anxious does 
play an important role in our sample of depressive and healthy 
persons: it can be activated very easily, since a lot of information 
flows through it (high betweenness), and, in turn, it can activate 
many other symptoms because it has many neighbours (high node 
strength, moderate clustering). The role of feeling anxious in our 
network is in line with high comorbidity levels of anxiety and 
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depressive disorders found in the literature 40 42 . Still, feeling anxious 
is not a symptom of depression according to current classifications, 
even though recent adaptations in DSM-5 propose an anxiety speci- 
fier for patients with mood disorders 43 . In line with this, our data 
suggest that people with a depressive disorder experience depressive 
symptoms often also feel anxious, although they may not have an 
anxiety disorder. This supports criticisms of the boundaries between 
MDD and generalised anxiety, which have been argued to be 
artificial 8 . 

Another interesting feature of networks lies in their organization 
in community structures: clusters of nodes that are relatively highly 
connected. In the present data, the Walktrap algorithm 44,45 reveals a 
structure involving six communities (see Figure 5). The purple clus- 
ter contains mostly negative mood symptoms, such as "feeling sad" 
(sad) and "feeling irritable" (irr); the pink cluster contains predomi- 
nantly positive mood symptoms, such as "capacity of pleasure" (pie) 
and "general interest" (int); the green cluster is related to anxiety and 
somatic symptoms, such as "anxiety" (anx) and "aches and pains" 
(ach); the blue and yellow clusters represent sleeping problems. 

Discussion 

eLasso is a computationally efficient method to estimate weighted, 
undirected networks from binary data. The present research indi- 
cates that the methodology performs well in situations that are rep- 
resentative for psychology and psychiatry, with respect to the 
number of available observations and variables. Network architec- 
tures were adequately recovered across simulation conditions and, 
insofar as errors were made, they concerned the suppression of very 
weak edges to zero. Thus, eLasso is a viable methodology to estimate 
network structure in typical research settings in psychology and 
psychiatry and fills the gap in estimating network structures from 
non-Gaussian data. 

Simulations indicated that the edges in the estimated network are 
nearly always trustworthy: the probability of including an edge, that 
is not present in the generating network, is very small even for small 
sample sizes. Due to the use of the lasso, more regression coefficients 
are set to zero in small sample sizes, which results in a more conser- 
vative estimation of network structure. For larger networks that are 
densely connected or that feature one node with a disproportionate 
number of connections, more observations are needed to yield a good 
estimate of the network. As the sample size grows, more and more 
true edges are estimated, in line with the asymptotic consistency of 
the method. 

The model we presented may be extended from its current dicho- 
tomous nature to accommodate ordinal data, which are also preval- 
ent in psychiatric research. For multinomial data, for example, the 
Potts model could be used 46 . This model is a generalisation of the 
Ising model with two states to a model with more than two states. 
Another straightforward extension of the model involves general- 
isation to binary time series data (by conditioning on the previous 
time point to render observations independent). 

Methods 

In this section we briefly explain the newly implemented method eLasso, provide the 
algorithm, describe the validation study and the real data we used to show the utility of 
eLasso. 

eLasso. Let x = [x\, x 2 , x n ) be a configuration where x, = 0 or 1. The conditional 
probability of Xj given all other nodes Xy according to the Ising model 26,47 is given by 



exp 




+ Xj 2 p jk x k 

kEV.j 


1 + exp 


T ; + E &** 

jfceVy 



where it and are the node parameter (or threshold) and the pairwise interaction 
parameter respectively. 



5 



Table 2 | Sensitivity and specificity, as a measure of performance of eLasso. Data is simulated under various conditions [s size , n no j esi 
connectedness (p (probability of a connection), pa (preferential attachment), pr (probability of rewiring)) when the AND-rule and y = .25 
is applied. For networks with 1 00 nodes, deviating levels of connectedness are displayed between brackets. Results of applying eLasso 
with the OR-rule and y = 0 are displayed between brackets 
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(0.324) 


(0.339) 


(0.339) 


(0.315) 


(0.288) 


(0.305) 


(0.359) 


(0.349) 


(0.342) 






SPE 


0.998 


0.997 


0.991 


0.998 


0.998 


0.999 


0.997 


0.995 


0.997 








(0.976) 


(0.961 ) 


(0.933) 


(0.978) 


(0.986) 


(0.987) 


(0.960) 


(0.962) 


(0.958) 




30 


SEN 


0.146 


0.128 


0.1 1 8 


0.146 


0.064 


0.044 


0.160 


0.147 


0.144 








(0.295) 


(0.307) 


(0.242) 


(0.269) 


(0.1 86) 


(0.126) 


(0.328) 


(0.287) 


(0.305) 






SPE 


0.999 


0.996 


0.982 


0.999 


0.999 


0.999 


0.999 


0.999 


0.999 








(0.982) 


(0.956) 


(0.922) 


(0.986) 


(0.99) 


(0.991) 


(0.976) 


(0.978) 


(0.976) 




100 


SEN 


0.080 


0.056 


0.031 


0.081 


0.067 


0.040 


0.121 


0.087 


0.092 








(0.1 86) 


(0.132) 


(0.134) 


(0.1 85) 


(0.139) 


(0.085) 


(0.238) 


(0.205) 


(0.195) 






SPE 


1 .000 


0.990 


0.983 


1.000 


1 .0000 


1.000 


1 .000 


1.000 


1.000 








(0.995) 


(0.962) 


(0.904) 


(0.997) 


(0.997) 


(0.998) 


(0.995) 


(0.995) 


(0.995) 


500 


10 


SEN 


0.550 


0.551 


0.617 


0.561 


0.501 


0.499 


0.650 


0.628 


0.623 








(0.649) 


(0.672) 


(0.704) 


(0.687) 


(0.71 3) 


(0.734) 


(0.726) 


(0.765) 


(0.757) 






SPE 


0.998 


0.993 


0.982 


0.996 


0.997 


0.995 


0.953 


0.964 


0.964 








(0.975) 


(0.945) 


(0.922) 


(0.957) 


(0.958) 


(0.966) 


(0.879) 


(0.869) 


(0.862) 




20 


SEN 


0.539 


0.537 


0.527 


0.492 


0.364 


0.302 


0.569 


0.538 


0.538 








(0.633) 


(0.678) 


(0.643) 


(0.61 3) 


(0.61 9) 


(0.557) 


(0.691 ) 


(0.665) 


(0.676) 






SPE 


0.998 


0.989 


0.971 


0.998 


0.999 


0.999 


0.992 


0.989 


0.990 








(0.976) 


(0.944) 


(0.904) 


(0.980) 


(0.985) 


(0.990) 


(0.945) 


(0.945) 


(0.945) 




30 


SEN 


0.508 


0.498 


0.298 


0.461 


0.260 


0.247 


0.536 


0.505 


0.504 








(0.637) 


(0.620) 


(0.470) 


(0.598) 


(0.465) 


(0.391) 


(0.662) 


(0.639) 


(0.628) 






SPE 


0.997 


0.984 


0.964 


0.999 


0.999 


0.999 


0.996 


0.996 


0.995 








(0.977) 


(0.939) 


(0.879) 


(0.985) 


(0.992) 


(0.994) 


(0.969) 


(0.965) 


(0.965) 




100 


SEN 


0.416 


0.189 


0.091 


0.372 


0.31 1 


0.174 


0.481 


0.433 


0.428 








(0.537) 


(0.336) 


(0.164) 


(0.498) 


(0.420) 


(0.289) 


(0.600) 


(0.554) 


(0.558) 






SPE 


0.999 


0.982 


0.964 


1 .000 


1 .000 


1.000 


0.999 


0.999 


0.999 








(0.989) 


(0.932) 


(0.91 3) 


(0.996) 


(0.997) 


(0.998) 


(0.992) 


(0.992) 


(0.992) 


1000 


10 


SEN 


0.726 


0.671 


0.710 


0.662 


0.620 


0.622 


0.738 


0.751 


0.758 








(0.794) 


(0.752) 


(0.783) 


(0.756) 


(0.781) 


(0.81 8) 


(0.814) 


(0.814) 


(0.820) 






SPE 


0.998 


0.993 


0.979 


0.994 


0.992 


0.995 


0.954 


0.957 


0.956 








(0.974) 


(0.950) 


(0.921 ) 


(0.952) 


(0.966) 


(0.974) 


(0.862) 


(0.867) 


(0.869) 




20 


SEN 


0.666 


0.665 


0.630 


0.599 


0.533 


0.431 


0.680 


0.664 


0.681 








(0.752) 


(0.784) 


(0.770) 


(0.736) 


(0.699) 


(0.709) 


(0.776) 


(0.764) 


(0.772) 






SPE 


0.998 


0.987 


0.968 


0.998 


0.999 


0.999 


0.991 


0.990 


0.988 








(0.976) 


(0.936) 


(0.886) 


(0.977) 


(0.984) 


(0.987) 


(0.946) 


(0.938) 


(0.938) 




30 


SEN 


0.658 


0.603 


0.420 


0.578 


0.389 


0.340 


0.669 


0.663 


0.661 








(0.736) 


(0.710) 


(0.583) 


(0.699) 


(0.566) 


(0.545) 


(0.752) 


(0.738) 


(0.740) 






SPE 


0.996 


0.982 


0.956 


0.999 


0.999 


0.999 


0.995 


0.993 


0.994 








(0.974) 


(0.931) 


(0.870) 


(0.985) 


(0.991) 


(0.993) 


(0.966) 


(0.963) 


(0.961 ) 




100 


SEN 


0.572 


0.286 


0.125 


0.519 


0.409 


0.284 


0.636 


0.579 


0.593 








(0.671) 


(0.427) 


(0.199) 


(0.624) 


(0.544) 


(0.351) 


(0.71 3) 


(0.679) 


(0.680) 






SPE 


0.999 


0.979 


0.957 


1 .000 


1 .000 


1.000 


0.999 


0.999 


0.999 








(0.987) 


(0.91 9) 


(0.908) 


(0.996) 


(0.997) 


(0.998) 


(0.991 ) 


(0.991 ) 


(0.991 ) 


2000 


10 


SEN 


0.71 1 


0.775 


0.810 


0.746 


0.728 


0.712 


0.821 


0.829 


0.822 








(0.808) 


(0.830) 


(0.842) 


(0.842) 


(0.870) 


(0.891) 


(0.880) 


(0.864) 


(0.866) 






SPE 


0.996 


0.994 


0.983 


0.996 


0.995 


0.993 


0.956 


0.960 


0.946 








(0.986) 


(0.951 ) 


(0.921 ) 


(0.955) 


(0.967) 


(0.968) 


(0.871 ) 


(0.873) 


(0.846) 




20 


SEN 


0.741 


0.770 


0.754 


0.691 


0.624 


0.566 


0.793 


0.769 


0.762 








(0.804) 


(0.838) 


(0.840) 


(0.805) 


(0.769) 


(0.754) 


(0.837) 


(0.836) 


(0.844) 






SPE 


0.997 


0.987 


0.962 


0.998 


0.998 


0.999 


0.988 


0.987 


0.986 








(0.977) 


(0.942) 


(0.876) 


(0.977) 


(0.984) 


(0.988) 


(0.942) 


(0.936) 


(0.932) 




30 


SEN 


0.756 


0.740 


0.529 


0.698 


0.483 


0.430 


0.772 


0.762 


0.754 








(0.808) 


(0.808) 


(0.656) 


(0.807) 


(0.712) 


(0.612) 


(0.837) 


(0.818) 


(0.825) 






SPE 


0.996 


0.974 


0.944 


0.999 


0.999 


0.999 


0.994 


0.992 


0.993 








(0.974) 


(0.917) 


(0.851) 


(0.984) 


(0.990) 


(0.993) 


(0.963) 


(0.961) 


(0.959) 




100 


SEN 


0.688 


0.385 


0.160 


0.648 


0.548 


0.349 


0.738 


0.708 


0.703 








(0.767) 


(0.539) 


(0.250) 


(0.736) 


(0.607) 


(0.398) 


(0.793) 


(0.776) 


(0.777) 






SPE 


0.998 


0.973 


0.954 


1.000 


1.000 


1.000 


0.999 


0.999 


0.999 








(0.986) 


(0.906) 


(0.895) 


(0.996) 


(0.997) 


(0.998) 


(0.991) 


(0.990) 


(0.990) 
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ach: aches and pains 

agi: psychomotor agitation 

anx: feeling anxious 

app: change of appetite 

con: concentration problems 

dia: diarrhea/constipation 

ene: energy level 

fal: falling asleep 

fut: view of myself 

hyp: hypersomnia 

int: general interest 

irr: feeling irritable 

pan: panic/phobic symptoms 

par: leaden paralysis 

pie: capacity for pleasure (not sex) 

qmo: quality of mood 

ret: psychomotor retardation 

rmo: respons of mood 

sad: feeling sad 

sel: view of oneself 

sen: interpersonal sensitivity 

sex: interest in sex 

sle: sleep during the night 

sui: suicidal thoughts 

sym: other bodily symptoms 

wak: waking up too early 

wei: change of weight 



Figure 3 | Application of eLasso to real data. The resulting network structure of a group of healthy controls and people with a current or history of 
depressive disorder (N = 1108). Cognitive symptoms are displayed as O and thicker edges (connections) represent stronger associations. 
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Figure 4 | Three centrality measures of the nodes in the network based on real data. From left to right: node strength, betweenness, and clustering 
coefficient. "Hypersomnia" (hyp) has no clustering coefficient, since it has only one neighbour. 
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: J2 ( Z ) X 'J + J2 Pjk x i) x ik - log 1 + exp ) Xj + x 'i=Pjk , 



(4) 



Figure 5 | Community structure of the network based on real data, 
detected by the Walktrap algorithm 44 ' 45 . 



In practice, the graph structure of psychological constructs is unknown. Therefore, 
the estimation of the unknown graph structure and the corresponding parameters is 
of central importance. By viewing X- as the response variable and all other variables Xy 
as the predictors, we may fit a logistic regression function to investigate which nodes 
are part of the neighbourhood of the response variable. The intercept Zj of the 
regression equation is the threshold of the variable, while the slope & fe of the 
regression equation is the connection strength between the relevant nodes. In order to 
perform the logistic regression, we need multiple independent observations of the 
variables. 

To establish which of the variables in the data are neighbours of a given variable, 
and which are not, we used £ y -regularised logistic regression 25,26 . This technique is 
commonly called the lasso (least absolute shrinkage and selection operator) and 
optimises neighbourhood selection in a computationally efficient way, by optimising 
the convex function 



&f = arg min & l - x tj • I ^ + ^ fi jk x } 



■ log 1+ exp<J T;+ Y^**fo \ ft* 



(2) 



in which i represents the independent observations {1,2, .., n], Qj contains all fi^ and 
Xi parameters, and p is the penalty parameter. The final term with p ensures shrinkage 
of the regression coefficients 24,26 . Parameter %t can be interpreted as the tendency of 
the variable to take the value 1, regardless of its neighbours. Parameter represents 
the interaction strength between j and k. 

The optimisation procedure is applied to each variable in turn with all other 
variables as predictors. To this end, the R package glmnet can be used 4S . The glmnet 
package uses a range of maximal 100 penalty parameter values. The result is a list of 
100 possible neighbourhood sets, some of which may be the same. To choose the best 
set of neighbours, we used the EBIC 28 {extended Bayesian Information Criterion). 
The EBIC is represented as 



mCy{j) = -2i 0; +|7|-log(n)+2y|/|-log(p-l), 



(3) 



in which ^0/J is the log likelihood (see below), j/j is the number of neighbours 

selected by logistic regression at a certain penalty parameter p, n is the number of 
observations, p — 1 is the number of covariates (predictors), and y is a hyperpara- 
meter, determining the strength of prior information on the size of the model space 31 . 
The EBIC has been shown to be consistent for model selection and to performs best 
with hyperparameter y — 0.25 for the Ising model 29 . The model with the set of 
neighbours /that has the lowest EBIC is selected. From equation (1), it follows that the 
log likelihood of the conditional probability of X given its neighbours X ne a) over all 
observations is 



At this stage, we have the regression coefficients of the best set of neighbours for every 
variable; i.e., we have both and and have to decide whether there is an edge 
between nodes/ and k or not. Two rules can be applied to make the decision: the AND 
rule, where an edge is present if both estimates are nonzero, and the OR rule, where an 
edge is present if at least one of the estimates is nonzero 25,26 . 

Although we do have the final edge set by applying one of the rules, note that for 
any two variables/ and k, we get two results: the result of the regression of/ on k 
and the result of the regression of k on/ (/?;»)■ To obtain an undirected graph, the 
weight of the edge between nodes j and k, coj k , is defined as the mean of both regression 
coefficients /L. and J?jy. All steps of the described method are summarised in the 
algorithm below and is incorporated in R package IsingFit (http://cran.r-project.org/ 
web/ packages/I singFit/I singFit.p df) . 

Input data set X for p variables and n subjects 

Output (weighted) edge set for all pairs X and X k 

1. Initialise: Select (randomly) one variable from the set of variables. This is the 
dependent variable. 

a. Perform ^-regularised logistic regression on all other variables (glmnet 
uses 100 values of penalty parameter p). 

b. Compute the EBIC for p (i.e., each set of neighbours). 

c. Identify the set of neighbours that yield the lowest EBIC. 

d. Collect the resulting regression parameters in matrix 0 with t on the diag- 
onal and j8 on the off-diagonal. 

e. Repeat steps a through d for all p variables. 

2. Determine the final edge set by applying the AND rule: if both regression 
coefficients and /Jw in 0 are nonzero, then there is an edge between nodes 
/ and k. 

3. Average the weights of the regression coefficients and Define 0* as the 
averaged weighted adjacency matrix with thresholds t on the diagonal. This is 
now a symmetric matrix. 

4. Create a graph corresponding to the off-diagonal elements of the averaged 
weighted adjacency matrix 0*. This can be done with qgraph in R 49 . 



Validation study. We generated data from the three most popular types of network 
architectures: random networks, scale-free networks, and small world (clustered) 
networks 30 " 32 . Figure la shows illustrative examples of each type of networks. 
Network sizes were chosen to be comparable to the most common number of items in 
symptom checklists (10, 20, and 30 nodes), but also large networks were included (100 
nodes). The level of connectivity of the networks was chosen to generate sparse 
networks. For this reason, in case of random networks, the probability of a connection 
(P C om) between two nodes was set to 0.1, 0.2, and 0.3. For small world networks, the 
neighbourhood was set to 2, and for scale-free networks only one edge is added per 
node at each iteration in the graph generating process. To obtain a wide variety of well 
known graph structures, the rewiring probability (P revi ., re ) in small world networks 
was set to 0.1, 0.5 and l,and the power of preferential attachment (P att ach) m scale-free 
networks was set to 1, 2 and 3. For the condition with 100 nodes, we used different 
levels of connectivity for random and scale-free networks (random networks: P contl — 
.05, .1, and .15; scale-free networks: P at t a ch ~ 1> 1-25, and 1.5). Otherwise, nodes will 
have too many connections. 

The generated networks are binary: all connections have weight 1 or 0. To create 
weighted networks, positive weights were assigned from squaring values from a 
normal distribution with a mean of 0 and a standard deviation of 1 to obtain weights 
in a realistic range. Examples of resulting weighted networks are displayed in 
Figure lb. Besides weights, thresholds of the nodes are added. To prevent nodes with 
many connections to be continuously activated and consequently having no variance, 
thresholds were generated from the normal distribution between zero and minus the 
degree of a node. 

From the weighted networks with thresholds, data was generated according to the 
Ising model by drawing samples using the Metropolis- Hastings algorithm, imple- 
mented in R using the IsingSampler package 50 " 52 . Four sample size conditions were 
chosen that are realistic in psychology and psychiatry: 100, 500, 1000, and 2000. The 
generated data were used to estimate networks with eLasso. Examples of the resulting 
estimated networks are displayed in Figure lc. 

This setup led toa3X4X3X4 quasi-factorial design, with the factors network 
type (random, small world, scale-free), level of connectedness, network size (10, 20, 
30, 100), and sample size (100, 500, 1000, 2000). Thus, the total simulation study 
involved 144 conditions. Each of these conditions was replicated 100 times. For each 
condition, the mean correlation between data generating and estimated parameters, 
the mean sensitivity, and the mean specificity is computed. These served as outcome 
measures, indicating the quality of network recovery. Sensitivity, or the true positive 
rate, is defined as SEN — TP/(TP + FN), in which TP is the number of true positives 
and FN is the number of false negatives. Specificity, or the true negative rate, is defined 
as SPE = TN/(TN + FP), in which TN is the number of true negatives and FP is the 
number of false positives. Note that, in order to compute sensitivity and specificity, 
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the off-diagonal elements of the weighted adjacency matrix 0* have to be 
dichotomised. 

Since specificity naturally takes on high values for sparse networks, also the Fl 
score is computed. For more details about the Fl score and the results, see 
Supplementary Information online. 

Data description. We used data from the Netherlands Study of Depression and 
Anxiety 36 (NESDA). This is an ongoing cohort study, designed to examine the long- 
term course and consequences of major depression and generalised anxiety disorder 
in the adult population (aged 18-65 years). At the baseline assessment in 2004, 2981 
persons were included. Participants consist of a healthy control group, people with a 
history of depressive or anxiety disorder and people with current depressive and/or 
anxiety disorder. 

To demonstrate eLasso, we selected individuals from NESDA with a current or 
history of depressive disorder and healthy controls. To this end, we excluded everyone 
with a current or history of anxiety disorder. The resulting data set contains 1 108 
participants. To construct a network we used 27 items of the self-report Inventory of 
Depressive Symptomatology 35 that relates to symptoms in the week prior to assess- 
ment (IDS). 

Data were dichotomised in order to allow the application of the Ising model. 
Therefore, the four response categories of the IDS items were recoded into 0 and 1. 
The first response category of each item indicates the absence of the symptom. In the 
case of "feeling sad", the first answering category is "I do not feel sad". This option is 
recoded to 0, since it indicates the absence of the symptom. The other three options 
("I feel sad less than half the time", "I feel sad more than half the time", and "I feel sad 
nearly all of the time") are recoded to 1, indicating the presence of the symptom to 
some extent. Other items are recoded similarly. 

Analysing the dichotomised data with our method and visualising the results with 
the qgraph package for R 49 , results in the network in Figure 3. The layout of the graph 
is based on the Fruchterman-Reingold algorithm, which iteratively computes the 
optimal layout so that nodes with stronger and/or more connections are placed closer 
to each other 53 . This network conceptualisation of depressive symptomatology might 
give new insights in issues that are still unexplained in psychology. 
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